78 research outputs found

    Efficient spatial keyword query processing on geo-textual data

    Efficient Distributed Clustering Algorithms on Star-Schema Heterogeneous Graphs

    SOUP: A fleet management system for passenger demand prediction and competitive taxi supply

    Online Trichromatic Pickup and Delivery Scheduling in Spatial Crowdsourcing

    SpeakNav: A voice-based navigation system via route description language understanding

    Exploring Data Partitions for What-if Analysis

    What-if analysis is a data-intensive exploration that inspects how changes in a set of input parameters of a model influence its outcomes. It is motivated by a user trying to understand the sensitivity of a model to a certain parameter in order to reach a set of goals defined over the outcomes. To avoid exploring all possible combinations of parameter values, efficient what-if analysis calls for a partitioning of parameter values into data ranges and a unified representation of the obtained outcomes per range. Traditional techniques to capture data ranges, such as histograms, are limited to a single outcome dimension. Yet, in practice, what-if analysis often involves conflicting goals that are defined over different dimensions of the outcome. Handling each of those goals independently cannot capture the inherent trade-off between them. In this paper, we propose techniques to recommend data ranges for what-if analysis that capture not only data regularities, but also the trade-off between conflicting goals. Specifically, we formulate a parametric data partitioning problem and propose a method to find an optimal solution for it. Targeting scalability to large datasets, we further provide a heuristic solution to this problem. By theoretical and empirical analyses, we establish performance guarantees in terms of runtime and result quality.
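
    The abstract does not spell out the paper's partitioning algorithm, so the following is only a minimal sketch of the underlying idea: split a sorted parameter range into k contiguous data ranges so that the multi-dimensional outcomes within each range are as homogeneous as possible, treating all goal dimensions jointly rather than one at a time. The function name partition_ranges, the sum-of-squared-error objective, and the dynamic-programming formulation are illustrative assumptions, not the method proposed in the paper.

# Illustrative sketch only (not the paper's algorithm): optimal contiguous
# partitioning of a sorted parameter range into k data ranges, scoring each
# range by its within-range squared error summed over all outcome dimensions.
import numpy as np

def partition_ranges(params, outcomes, k):
    """params: (n,) sorted parameter values; outcomes: (n, d) outcome matrix;
    k: number of ranges (assumes k <= n). Returns k (low, high) value pairs."""
    n, d = outcomes.shape
    prefix = np.vstack([np.zeros(d), np.cumsum(outcomes, axis=0)])
    prefix_sq = np.vstack([np.zeros(d), np.cumsum(outcomes ** 2, axis=0)])

    def cost(i, j):
        # Sum of squared deviations of items i..j from their mean, over all dims.
        cnt = j - i + 1
        s = prefix[j + 1] - prefix[i]
        sq = prefix_sq[j + 1] - prefix_sq[i]
        return float(np.sum(sq - s ** 2 / cnt))

    INF = float("inf")
    dp = [[INF] * n for _ in range(k + 1)]   # dp[m][j]: best cost of m ranges over items 0..j
    back = [[0] * n for _ in range(k + 1)]
    for j in range(n):
        dp[1][j] = cost(0, j)
    for m in range(2, k + 1):
        for j in range(m - 1, n):
            for i in range(m - 1, j + 1):    # i = start index of the last range
                c = dp[m - 1][i - 1] + cost(i, j)
                if c < dp[m][j]:
                    dp[m][j], back[m][j] = c, i

    # Recover the range boundaries by walking the backpointers.
    bounds, j = [], n - 1
    for m in range(k, 0, -1):
        i = 0 if m == 1 else back[m][j]
        bounds.append((params[i], params[j]))
        j = i - 1
    return list(reversed(bounds))

    A heuristic variant, as the abstract mentions for large datasets, would replace this cubic dynamic program with something cheaper, e.g. greedily merging adjacent ranges with the smallest cost increase.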

    ProbeSim: Scalable single-source and top-k SimRank computations on dynamic graphs

    Single-source and top-k SimRank queries are two important types of similarity search in graphs, with numerous applications in web mining, social network analysis, spam detection, etc. A plethora of techniques have been proposed for these two types of queries, but very few can efficiently support similarity search over large dynamic graphs, due to either significant preprocessing time or large space overheads. This paper presents ProbeSim, an index-free algorithm for single-source and top-k SimRank queries that provides a nontrivial theoretical guarantee on the absolute error of query results. ProbeSim estimates SimRank similarities without precomputing any indexing structures, and thus can naturally support real-time SimRank queries on dynamic graphs. Besides the theoretical guarantee, ProbeSim also offers satisfactory practical efficiency and effectiveness due to non-trivial optimizations. We conduct extensive experiments on a number of benchmark datasets, which demonstrate that our solutions outperform the existing methods in terms of efficiency and effectiveness. Notably, our experiments include the first empirical study that evaluates the effectiveness of SimRank algorithms on graphs with billions of edges, using the idea of pooling.
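
    For intuition, the sketch below is a naive Monte Carlo baseline for single-source SimRank based on its random-walk interpretation: s(u, v) is the expected value of c^t, where t is the first step at which two independent reverse random walks from u and v meet. This is not the ProbeSim algorithm, which avoids sampling a walk per candidate node and carries the stated error guarantee; the function names, decay factor, and parameter defaults are illustrative assumptions.

# Naive Monte Carlo baseline (not ProbeSim): estimate single-source SimRank
# by sampling reverse random walks and recording their first meeting step.
import random

def reverse_walk(in_neighbors, start, length):
    """One random walk of at most `length` steps that follows in-edges."""
    path, node = [start], start
    for _ in range(length):
        preds = in_neighbors.get(node, [])
        if not preds:
            break
        node = random.choice(preds)
        path.append(node)
    return path

def single_source_simrank(in_neighbors, query, c=0.6, samples=200, length=10):
    """in_neighbors: dict node -> list of in-neighbors.
    Returns estimated s(query, v) for every node v."""
    nodes = list(in_neighbors)
    scores = {v: 0.0 for v in nodes}
    scores[query] = float(samples)          # s(v, v) = 1 by definition
    for _ in range(samples):
        wq = reverse_walk(in_neighbors, query, length)
        for v in nodes:
            if v == query:
                continue
            wv = reverse_walk(in_neighbors, v, length)
            for t in range(1, min(len(wq), len(wv))):
                if wq[t] == wv[t]:          # first step at which the walks meet
                    scores[v] += c ** t
                    break
    return {v: s / samples for v, s in scores.items()}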

    User Guidance for Efficient Fact Checking

    The Web constitutes a valuable source of information. In recent years, it fostered the construction of large-scale knowledge bases, such as Freebase, YAGO, and DBpedia. The open nature of the Web, with content potentially being generated by everyone, however, leads to inaccuracies and misinformation. Construction and maintenance of a knowledge base thus have to rely on fact checking, an assessment of the credibility of facts. Due to an inherent lack of ground truth information, such fact checking cannot be done in a purely automated manner, but requires human involvement. In this paper, we propose a comprehensive framework to guide users in the validation of facts, striving for a minimisation of the invested effort. Our framework is grounded in a novel probabilistic model that combines user input with automated credibility inference. Based thereon, we show how to guide users in fact checking by identifying the facts for which validation is most beneficial. Moreover, our framework includes techniques to reduce the manual effort invested in fact checking by determining when to stop the validation and by supporting efficient batching strategies. We further show how to handle fact checking in a streaming setting. Our experiments with three real-world datasets demonstrate the efficiency and effectiveness of our framework: a knowledge base of high quality, with a precision of above 90%, is constructed with only half of the validation effort required by baseline techniques.
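
    The abstract does not detail the probabilistic model, so the snippet below only illustrates the general idea of steering validation effort: given automatically inferred credibility scores, hand the human validator the fact whose truth value is most uncertain, and stop once every remaining fact is confidently classified. The entropy-based selection, the threshold, and the function names are illustrative assumptions rather than the paper's actual criteria.

# Illustrative sketch only (not the paper's model): pick the next fact to
# validate by maximum uncertainty of its inferred credibility score.
import math

def bernoulli_entropy(p):
    """Uncertainty of a fact whose estimated probability of being correct is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def next_fact_to_validate(credibility, min_entropy=0.2):
    """credibility: dict fact_id -> inferred probability that the fact is correct.
    Returns the most uncertain fact, or None when every fact is already
    confident enough (a crude stand-in for the paper's stopping criteria)."""
    fact_id, p = max(credibility.items(), key=lambda kv: bernoulli_entropy(kv[1]))
    if bernoulli_entropy(p) < min_entropy:
        return None
    return fact_id

# After the user labels the returned fact, the credibility estimates would be
# re-inferred and the selection repeated until the stopping rule fires.
facts = {"f1": 0.95, "f2": 0.55, "f3": 0.10}
print(next_fact_to_validate(facts))    # -> "f2", the least certain fact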